International Opinions on Grading of Urothelial Carcinoma: A Survey Among European Association of Urology and International Society of Urological Pathology Members

Take Home Message
Both the World Health Organization (WHO) 1973 and WHO2004 grading systems are used widely. However, there was limited support for WHO1973 and WHO2004 in their current formats, while the hybrid (three-tier) grading system with low-grade, high-grade (HG)-G2, and HG-G3 as categories could be considered the most promising alternative.


Introduction
As non-muscle-invasive bladder cancer (NMIBC) is a heterogeneous disease, accurate risk stratification, based on both clinical and pathological factors, is crucial to determine optimal treatment and surveillance strategies for each patient. Histological grade of NMIBC is an important prognostic factor for progression to muscle-invasive and/or metastatic disease [1,2].
The World Health Organization (WHO) adopted the first bladder cancer grading classification in 1973, dividing papillary urothelial carcinomas into grades 1-3 (G1, G2, and G3) [3]. The lack of clear histological criteria for the three WHO1973 grades, with the majority of carcinomas being classified in the middle group (G2), prompted the proposal of a new grading classification scheme in 1998 [3,4]. This WHO/International Society of Urological Pathology (ISUP) scheme consisted of papillary urothelial neoplasm of low malignant potential (PUNLMP), and low-grade (LG) and high-grade (HG) noninvasive papillary urothelial carcinoma with more detailed histological criteria [4]. The PUNLMP entity was created to avoid the "cancer" label in a category of patients with a presumed low risk of recurrence [4]. Subsequently, the WHO/ISUP1998 classification was modified into a four-tier classification (WHO1999), in which HG was subclassified into G2 and G3 with more resemblance to the older WHO1973 grading system [5,6]. Ultimately, the WHO adopted the three-tier 1998 classification in 2004 and subsequently in 2016 [7] and 2022 (Fig. 1) [8].
Unlike the WHO1973 system, the WHO2004 grading system was adopted on the basis of its clear histological definitions of each category but without clinical evidence supporting its prognostic value [6]. Eventually, Soukup et al. [9] compared the prognostic performance of WHO1973 and WHO2004 in a systematic review in 2017 and concluded that WHO2004 did not outperform WHO1973, yet suggested that larger studies with individual patient data (IPD) were needed since the review was based on a small number of clinical studies with relatively few patients (five studies with a total of 931 patients for recurrence; seven studies with a total of 1371 patients for progression). Therefore, van Rhijn et al. [10] collected IPD of 5145 patients to compare both grading systems and showed that the prognostic value of WHO1973 for predicting progression of primary Ta-T1 NMIBC was better than that of WHO2004 in a real-world setting. Moreover, a combination of both systems (LG/G1, LG/G2, HG/G2, and HG/G3; hybrid four tier) proved superior to either system alone (Fig. 1) [10]. Their results did not support PUNLMP as a separate grade category because the prognosis was comparable with Ta-LG and the diagnosis had become extremely rare [11]. Subsequently, the ISUP established a multidisciplinary workgroup of nine pathologists and four urologists with long-standing expertise in the field to determine the requirements for an optimal classification system based on a literature review of clinical and molecular evidence. These 13 experts were also surveyed regarding several aspects of the WHO1973 and WHO2004 grading systems [12].
The aim of this current EAU-ISUP survey, composed of the same questions as for the multidisciplinary workgroup [12], was to ask a larger sample of urologists (EAU) and pathologists (ISUP) about their preferences regarding NMIBC grade to generate a multidisciplinary dialogue. This survey, in part, also informed a subsequent ISUP consensus bladder conference meeting in September 2022, the proceedings of which will be published in a separate manuscript. The purpose of this survey was not to form guidelines or recommendations, as those should primarily be based on clinical evidence.

Patients and methods
With permission of the ISUP and the EAU central guidelines office, a web-based anonymous questionnaire was launched with ten questions on grading of NMIBC. All questions were multiple choice, and some questions offered the option to provide comments if the preferred answer was not available. The invitation to participate in the survey was circulated by e-mail to the EAU and ISUP members by the end of 2021. Survey questions were similar to the survey questions in the study by van der Kwast et al. [12] and are displayed in the Supplementary material.
The answers of the surveyed experts were obtained as well [12]. Descriptive statistics were summarized as frequencies and percentages.

Results
Most (59/64, 92%) of the pathologists from North America use only the WHO2004 grading system, while the majority (50/84, 60%) of the pathologists from Europe use both WHO1973 and WHO2004, followed by 32/84 (38%) who use only WHO2004. Four out of six (67%) experts from North America use the WHO2004, and the other two use both WHO1973 and WHO2004. All five experts from Europe use both WHO1973 and WHO2004.
With respect to the question of what a future grading system should look like, a minority (20%, including 14% urologists, 26% pathologists, and 8% experts) would like to keep WHO2004 as it is, that is, PUNLMP, LG, HG. This opinion is shared by more North American pathologists (31%) than European pathologists (18%). In total, 174/414 (42%) respondents would continue with the WHO2004 grading classification. However, more than half (90/174, 52%) would omit PUNLMP from WHO2004 and turn it into a two-tier grading system (LG and HG). Among urologists, there was limited support for using an unmodified WHO1973 grading system (27/174, 16%) or the current WHO2004 grading system without modification (24/174, 14%). Very few pathologists (8/221, 4%) would prefer the previous WHO1973 grading system as it currently exists (Fig. 5B).
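Because the descriptive statistics are reported as simple frequencies and percentages, the figures in the two preceding paragraphs can be re-derived from their counts. The following is a minimal sketch for that check only; the counts and stated percentages are copied from the text, while the dictionary labels and the `percentage` helper are illustrative names that do not come from the survey itself.

```python
# Sanity check of the reported frequencies and percentages.
# Each entry is (count, total, percentage stated in the text).
# The labels are illustrative shorthand, not wording from the questionnaire.
reported = {
    "North American pathologists, WHO2004 only": (59, 64, 92),
    "European pathologists, WHO1973 and WHO2004": (50, 84, 60),
    "European pathologists, WHO2004 only": (32, 84, 38),
    "North American experts, WHO2004": (4, 6, 67),
    "Respondents continuing with WHO2004": (174, 414, 42),
    "Of those, omitting PUNLMP": (90, 174, 52),
    "Urologists, unmodified WHO1973": (27, 174, 16),
    "Urologists, unmodified WHO2004": (24, 174, 14),
    "Pathologists, WHO1973 as it currently exists": (8, 221, 4),
}


def percentage(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


for label, (count, total, stated) in reported.items():
    computed = percentage(count, total)
    status = "ok" if computed == stated else "MISMATCH"
    print(f"{label}: {count}/{total} = {computed}% (stated {stated}%) [{status}]")
```

Rounding to whole percentages matches how the figures are presented in the text; none of these ratios falls exactly on a half, so the rounding rule does not affect the result.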

Discussion
Histological grade of NMIBC is an important prognostic factor for progression to muscle-invasive and/or metastatic disease [1,2]. However, grading of NMIBC is a matter of continuing debate as it remains questionable whether the WHO2004 classification system actually improved risk stratification for clinical treatment and surveillance strategies [9–11,13]. The aim of this EAU-ISUP survey was to ask a large sample of urologists and pathologists from the EAU and ISUP about their preferences regarding grading NMIBC to generate a multidisciplinary dialogue on this contentious issue.
The most commonly utilized grading systems worldwide are the WHO1973 and WHO2004 classification systems. However, there is no international agreement on their use, and the two classification systems are not directly translatable into each other because of different cutoff points between grades. The American Urological Association and the WHO recommend using the WHO2004 grading system, whereas the EAU supports the use of both grading systems [1,14,15]. Moreover, the ISUP recently suggested a hybrid three-tier (LG, HG/G2, and HG/G3) grading classification [12]. In this survey, we found marked variability in the currently utilized grading schemes, as 53% use WHO2004 and 44% still use WHO1973 (together with WHO2004). The variability is partly dependent on geographical location, as the majority, though not all, of pathologists from Europe (60%) use both WHO1973 and WHO2004, and most pathologists from North America (92%) use the WHO2004 grading system. Moreover, approximately half of the urologists, all of whom were EAU respondents, currently use both the WHO1973 and the WHO2004 grading systems. Hence, there is no international consensus on NMIBC grading.
The PUNLMP category has been subject to debate since its introduction. Comparable recurrence and progression rates between PUNLMP and Ta-LG carcinomas were recently confirmed by Hentschel et al. [11]. In addition, a strong decline in PUNLMP diagnosis, from 31% before 2000 to only 1% after 2010, has been observed [11]. Therefore, Hentschel et al. [11] concluded that there was limited support to retain PUNLMP as a separate grade category. Nevertheless, from population-based data in the context of WHO1999, patients with primary LG/G1 tumors were reported to have an increased risk of recurrence compared with those with PUNLMP [16]. The controversy about this category was confirmed by our survey results, since 61% of respondents never or rarely encounter a PUNLMP lesion in daily clinical practice and barely 10% do so commonly. Moreover, nearly 60% of respondents believe that the management of PUNLMP should be similar to that of Ta-LG carcinomas. In addition, of the respondents who preferred to continue the WHO2004 grading classification in the future (n = 174), more than half would omit PUNLMP, thereby turning the WHO2004 grading system into a two-tier grading system consisting of only LG and HG. Taken together, the use of PUNLMP as a separate category within WHO2004 is also questioned by the current survey results.
The major criticism of the WHO1973 grading system has always been the lack of clearly defined criteria, particularly for grade 2 cancers ("not grade 1, not grade 3"), which consequently led to a "default" G2 diagnosis of a large clinically heterogeneous bladder cancer population. Nevertheless, it remains questionable whether the implementation of the WHO2004 classification solved the interobserver variability issues and actually improved NMIBC grading [9]. Even though the adoption of WHO2004 was meant to reduce interobserver variability by defining clear histological criteria, subsequent studies on the reproducibility of WHO2004 showed that observer variability did not really improve by formulating these detailed criteria for each grade category [9,13]. A recent study by van Rhijn et al. [10] compared both grading systems and showed that the prognostic value of WHO1973 for predicting progression of primary Ta-T1 NMIBC was better than that of WHO2004. Although WHO1973 may seem outdated, 44% of our survey respondents still use WHO1973 in daily clinical practice, mostly in the context of dual grading with WHO2004. Further, reporting of the WHO1973-G3 subgroup in WHO2004-HG carcinomas would have a clinical impact according to 69% of urologists and 77% of experts. Somewhat surprisingly, reverting to WHO1973 would be an option for 73% of the survey respondents, yet for most (56%) only if further modifications are made to the system or grading criteria are more detailed.
Strikingly, retention of the current iteration of the WHO2004 grading system is favored by only 20% of survey respondents, and a small number (10%) of respondents would favor the WHO1973 classification in its current form. Approximately half the respondents would prefer a hybrid three- or four-tier grading classification based on both WHO1973 and WHO2004, with the majority favoring three-tier grading, that is, WHO2004 minus PUNLMP and splitting up WHO2004-HG into WHO1973-G2 and WHO1973-G3. Similar results were found in the survey conducted among 13 ISUP experts, as only one would prefer WHO2004 as it is and the largest group (n = 6, 46%) preferred the three-tier hybrid grading system option [12]. From a clinical point of view, van Rhijn et al. [10] showed in 5145 patients from 17 hospitals that a combination of both WHO1973 and WHO2004 (LG/G1, LG/G2, HG/G2, and HG/G3) was superior to either system alone. Another contemporary study by Downes et al. [17] in 609 patients from two North American hospitals also reported that the hybrid three- or four-tier grading system is a better prognosticator for progression than either WHO1973 or WHO2004 alone. Both studies strongly support the separation of the WHO2004-HG category into WHO1973-G2 and WHO1973-G3 with clinically relevant differences in progression (van Rhijn et al. [10] 8% vs 19%; Downes et al. [17] 27% vs 44%). In contrast, the separation of WHO2004-LG into WHO1973-G1 and WHO1973-G2 was found to be clinically less relevant because of the lower progression risks (van Rhijn et al. [10] 1% vs 4%; Downes et al. [17] 3% vs 4%). Likewise, an ISUP multidisciplinary workgroup on NMIBC grade recently suggested subdividing the heterogeneous WHO2004-HG category into WHO1973-G2 and WHO1973-G3, based on both clinical and molecular evidence [12]. Based on our survey results, opinions on a future grading system are quite scattered. Nonetheless, there was clearly limited support for the continuation of both WHO1973 and WHO2004 in their current form, while a hybrid (three- or four-tier) grading system received more support. Therefore, a hybrid grading system could indeed be a good option for bladder cancer grading in the future, as it is also supported by clinical data [10,17] and previous suggestions to introduce such a system (WHO1999) [5].
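The hybrid categories discussed above can be written out as an explicit lookup from a dual grade (WHO2004 plus WHO1973) to a hybrid tier. The sketch below is illustrative only: the category labels follow the text, but the function name `hybrid_grade`, its `tiers` parameter, and the choice to reject combinations that the text does not list (for example, PUNLMP or HG with G1) are assumptions made for this example, not part of any published grading rule.

```python
# Illustrative lookup for the hybrid grading categories named in the text.
# Keys are (WHO2004 grade, WHO1973 grade); values are hybrid four-tier labels.
HYBRID_FOUR_TIER = {
    ("LG", "G1"): "LG/G1",
    ("LG", "G2"): "LG/G2",
    ("HG", "G2"): "HG/G2",
    ("HG", "G3"): "HG/G3",
}


def hybrid_grade(who2004: str, who1973: str, tiers: int = 3) -> str:
    """Map a dual grade to a hybrid three- or four-tier category.

    Assumption: combinations not listed in the text (including PUNLMP)
    are rejected rather than guessed.
    """
    key = (who2004.upper(), who1973.upper())
    if key not in HYBRID_FOUR_TIER:
        raise ValueError(f"No hybrid category defined for {key}")
    if tiers == 4:
        return HYBRID_FOUR_TIER[key]
    if tiers == 3:
        # Three-tier: LG is kept whole; only HG is split into G2 and G3.
        return "LG" if key[0] == "LG" else HYBRID_FOUR_TIER[key]
    raise ValueError("tiers must be 3 or 4")


print(hybrid_grade("HG", "G2"))           # hybrid three-tier category
print(hybrid_grade("LG", "G2", tiers=4))  # hybrid four-tier category
```

In the three-tier variant the two LG subgroups collapse into a single LG category, which mirrors the observation above that separating WHO2004-LG into G1 and G2 was clinically less relevant than separating WHO2004-HG.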
One of the limitations of our survey is that the number of respondents represents a relatively small sample of all EAU and ISUP members. Another limitation is that all urologists are EAU members, and therefore the results predominantly reflect European urology practice. Nevertheless, since experts in the field, and both urologists and pathologists working in various parts of the world and in different hospital settings, have responded, our results may be a reasonable reflection of reality. The aim of this survey was not to establish guidelines or recommendations, as those should primarily be based on clinical evidence. We believe that our survey proved useful to evaluate the discordance/concordance between physicians (urologists/pathologists) when assessing the impact of grade on guidelines and in current as well as future daily clinical practice.

Conclusions
Grading of NMIBC is a matter of ongoing debate, and there seems to be no international consensus. The "old" WHO1973 system is still widely used in some form and in some jurisdictions. The present survey results among experts and members of the ISUP and the EAU showed that PUNLMP as a separate category was not used widely apart from by a minority in North America. Strikingly, reverting to WHO1973 grading would be an option for the majority of pathologists and urologists, provided that grading criteria are more detailed in the future. The vast majority of the respondents believed that subdivision of the HG category of WHO2004 into WHO1973-G2 and WHO1973-G3 would influence clinical management of NMIBC. There seems to be only limited support among survey participants for the continuation of the existing WHO2004 (20%) and WHO1973 (10%) grading classification systems in their current form. Our survey and recent clinical data showed that a hybrid, preferably three-tier (based on both WHO1973 and WHO2004 with the categories LG, HG-G2, and HG-G3), grading system seems to be a promising, clinically prognostic alternative for the future.